---
title: Set up data drift monitoring
description: Configure data drift monitoring on a deployment's Data Drift Settings tab.

---

# Set up data drift monitoring {: #set-up-data-drift-monitoring }

After you deploy a model, the prediction data it receives can differ from the dataset used for training and validation. You can enable data drift monitoring on the **Data Drift > Settings** tab. DataRobot monitors both target and feature drift information and displays results on the [**Data Drift**](data-drift) tab.

{% include 'includes/how-dr-tracks-drift-include.md' %}

!!! info "Availability information"
    Data drift tracking is only available for deployments using deployment-aware prediction API routes (i.e., `https://example.datarobot.com/predApi/v1.0/deployments/<deploymentId>/predictions`).
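
As a minimal sketch of what "deployment-aware" means, the snippet below builds a route that follows the pattern shown above, with the deployment ID embedded in the path. The helper name `deployment_prediction_url` and the sample ID are hypothetical and used only for illustration. Authentication headers and the request body are omitted because they depend on your installation.

```python
from urllib.parse import quote


def deployment_prediction_url(host: str, deployment_id: str) -> str:
    """Build a deployment-aware prediction route (hypothetical helper).

    The path layout is copied from the example route in this page; the
    deployment ID is URL-quoted so it is safe to embed in the path.
    """
    safe_id = quote(deployment_id, safe="")
    return f"https://{host}/predApi/v1.0/deployments/{safe_id}/predictions"


# "abc123" is a placeholder, not a real deployment ID.
url = deployment_prediction_url("example.datarobot.com", "abc123")
print(url)
```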

On a deployment's **Data Drift Settings** page, you can configure the following settings:

![](images/data-drift-settings.png)

| Field                   | Description                |
|-------------------------|----------------------------|
|  **Data Drift**  | :~~: |
| Enable feature drift tracking | Configures DataRobot to track feature drift in a deployment. Training data is required for feature drift tracking. |
| Enable target monitoring | Configures DataRobot to track target drift in a deployment. Target monitoring is required for [accuracy monitoring](accuracy-settings). |
|  **Training data**  | :~~: |
| Training data       | Displays the dataset used as a training baseline while building a model. |
|  **Inference data** | :~~: |
| DataRobot is storing your predictions | Confirms DataRobot is recording and storing the results of any predictions made by this deployment. DataRobot stores a deployment's inference data when a deployment is created. It cannot be uploaded separately. |
|  **Inference data (external model)** | :~~: |
| DataRobot is recording the results of any predictions made against this deployment | Confirms DataRobot is recording and storing the results of any predictions made by the external model. |
| Drop file(s) here or choose file | Uploads a file with prediction history data to monitor data drift. |
|  **Definition**           | :~~: |
| [Set definition](#define-data-drift-monitoring-notifications) | Configures the drift and importance metric settings and threshold definitions for data drift monitoring. |
|  **Notifications**           | :~~: |
| [Send notifications](#schedule-notification-checks) | Configures the schedule for data drift monitoring notification checks. |

!!! note
    DataRobot monitors both target and feature drift information by default and displays results in the [Data Drift dashboard](data-drift). Use the **Enable target monitoring** and **Enable feature drift tracking** toggles to turn off tracking if, for example, you have sensitive data that should not be monitored in the deployment. The **Enable target monitoring** setting is also required to enable [accuracy monitoring](accuracy-settings).

You can customize how data drift is monitored. See the data drift page for more information on [customizing data drift status](data-drift#customize-data-drift-status) for deployments.

## Define data drift monitoring notifications {: #define-data-drift-monitoring-notifications }

Drift assesses how the distribution of data changes across all features for a specified range. The thresholds you set determine the amount of drift you will allow before a notification is triggered.

!!! note
    Only deployment _Owners_ can modify data drift monitoring settings; however, _Users_ can [configure the conditions under which notifications are sent to them](deploy-notifications). _Consumers_ cannot modify monitoring or notification settings.

Use the **Definition** section of the **Data Drift > Settings** tab to set thresholds for drift and importance:

* Drift is a measure of how new prediction data differs from the original data used to train the model.

* Importance allows you to separate the features you care most about from those that are less important. 

For both drift and importance, you can visualize the thresholds and how they separate the features on the [Data Drift tab](data-drift). By default, the data drift status for a deployment is marked as "Failing" (![](images/icon-red.png)) when at least one high-importance feature exceeds the set drift metric threshold. It is marked as "At Risk" (![](images/icon-yellow.png)) when no high-importance feature exceeds the threshold but at least one low-importance feature does.

Deployment _Owners_ can customize the rules used to calculate the drift status for each deployment. As a deployment _Owner_, you can:

* Define or override the list of high- or low-importance features, so you can monitor the features that matter most to you and put less emphasis on the rest.

* Exclude features expected to drift from drift status calculation and alerting so you do not get false alarms.

* Customize what "At Risk" and "Failing" drift statuses mean to tailor the drift status of each deployment to your needs.
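
The default status rule, together with exclusions and starred features, can be sketched as follows. This illustrates the logic described on this page and is not DataRobot's implementation. The `Feature` class, the `drift_status` function, the sample threshold values, and the use of a strict `>` comparison for the importance threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    drift: float       # drift metric value (for example, PSI)
    importance: float  # importance metric value


def drift_status(features, drift_threshold, importance_threshold,
                 excluded=frozenset(), starred=frozenset()):
    """Return "Failing", "At Risk", or "Passing" under the default rule."""
    high_drifted = 0
    low_drifted = 0
    for f in features:
        if f.name in excluded:
            continue  # excluded features never affect the status
        if f.drift <= drift_threshold:
            continue  # feature has not exceeded the drift threshold
        # Starred features count as high importance regardless of the
        # importance threshold (strict ">" here is an assumption).
        if f.name in starred or f.importance > importance_threshold:
            high_drifted += 1
        else:
            low_drifted += 1
    if high_drifted >= 1:
        return "Failing"
    if low_drifted >= 1:
        return "At Risk"
    return "Passing"


# Illustrative values only: "a" drifts but has low importance.
features = [Feature("a", 0.30, 0.1), Feature("b", 0.05, 0.9)]
print(drift_status(features, 0.15, 0.5))
print(drift_status(features, 0.15, 0.5, starred={"a"}))
print(drift_status(features, 0.15, 0.5, excluded={"a"}))
```

In this sketch, starring feature `a` moves the status from "At Risk" to "Failing", and excluding it returns the status to "Passing".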

To set up monitoring of drift status for a deployment:

1. On the **Data Drift Settings** page, in the **Definition** section, configure the settings for monitoring data drift:

    ![](images/data-drift-monitoring-definition.png)
    
    |  | Element | Description |
    |--|---------|-------------|
    | ![](images/icon-1.png) | Range | Adjusts the time range of the **Reference period**, which compares training data to prediction data. Select a time range from the dropdown menu. |
    | ![](images/icon-2.png) | Drift metric and threshold | Configures the thresholds of the drift metric. DataRobot only supports the Population Stability Index (PSI) metric. When drift thresholds are changed, the [Feature Drift vs. Feature Importance chart](data-drift#feature-drift-vs-feature-importance-chart) updates to reflect the changes. For more information, see the note on [Drift metric support](#drift-metric-support) below. |
    | ![](images/icon-3.png) | Importance metric and threshold | Configures the thresholds of the Importance metric. The Importance metric measures the most impactful features in the training data. DataRobot only supports the Permutation Importance metric. When importance thresholds are changed, the [Feature Drift vs. Feature Importance chart](data-drift#feature-drift-vs-feature-importance-chart) updates to reflect the changes. See an [example](#example-of-configuring-the-importance-and-drift-thresholds). |
    | ![](images/icon-4.png) | `X` excluded features | Excludes features (including the target) from drift status calculations. Click **`X` excluded features** to open a dialog box where you can enter the names of features to set as **Drift exclusions**. Excluded features do not affect drift status for the deployment but still display on the Feature Drift vs. Feature Importance chart. See an [example](#example-of-an-excluded-feature). |
    | ![](images/icon-5.png) | `X` starred features | Sets features to be treated as high importance even if they were initially assigned low importance. Click **`X` starred features** to open a dialog box where you can enter the names of features to set as **High-importance stars**. Once added, these features are assigned high importance. They ignore the importance thresholds, but still display on the Feature Drift vs. Feature Importance chart. See an [example](#example-of-starring-a-feature-to-assign-high-importance). |   
    | ![](images/icon-6.png) | "At Risk" / "Failing" thresholds  | Configures the values that trigger drift statuses for "At Risk" (![](images/icon-yellow.png)) and "Failing" (![](images/icon-red.png)). See an [example](#example-of-setting-a-drift-status-rule).|

    !!! note
        Changes to thresholds apply to every time period in which predictions were made, across the entire history of a deployment. The updated thresholds are reflected in the performance monitoring visualizations on the [Data Drift](data-drift) tab.

2. After updating the data drift monitoring settings, click **Save**.

{% include 'includes/drift-metrics-support.md' %}
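
For intuition about the Population Stability Index, the sketch below uses the common textbook form, `sum((a_i - e_i) * ln(a_i / e_i))`, where `e_i` and `a_i` are the proportions of training and prediction values that fall in bin `i`. The equal-width binning, the bin count, the `eps` floor for empty bins, and the function name `psi` are assumptions made for this example. DataRobot's own binning and calculation may differ.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two non-empty numeric samples.

    `expected` is the baseline (training) sample and `actual` is the new
    (prediction) sample. Bins are equal-width over the baseline's range.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        total = len(values)
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / total, eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


train = [float(v) for v in range(100)]
shifted = [v + 30.0 for v in train]
print(psi(train, train))    # identical samples give 0.0
print(psi(train, shifted))  # a shifted sample gives a positive value
```

Each term of the sum is non-negative, so the index grows as the two distributions move apart. A drift threshold is a cutoff on this value.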
	
### Example of an excluded feature {: #example-of-an-excluded-feature }

In the example below, the excluded feature, which appears as a gray circle, would normally change the drift status to "Failing" (![](images/icon-red.png)). Because it is excluded, the status remains "Passing".

![](images/dd-status-4.png)

### Example of configuring the importance and drift thresholds {: #example-of-configuring-the-importance-and-drift-thresholds }

In the example below, the importance and drift thresholds have been adjusted (indicated by the arrows), resulting in more features being "At Risk" and "Failing" than in the chart above.

![](images/dd-status-8.png)

### Example of starring a feature to assign high importance {: #example-of-starring-a-feature-to-assign-high-importance }

In the example below, the starred feature, which appears as a white circle, would normally cause drift status to be "At Risk" due to its initially low importance. However, since it is assigned high importance, the feature will change the drift status to "Failing" (![](images/icon-red.png)).

![](images/dd-status-6.png)

### Example of setting a drift status rule {: #example-of-setting-a-drift-status-rule }

The following example configures the rule for a deployment to mark its drift status as "At Risk" if one of the following is true:

* The number of low-importance features above the drift threshold is greater than 1.

* The number of high-importance features above the drift threshold is greater than 3.

![](images/dd-status-9.png)
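
The rule in this example can be expressed as a simple predicate. This is a sketch only: the function name `is_at_risk` and its parameters are hypothetical, and the defaults mirror the limits stated above (`1` for low-importance features and `3` for high-importance features).

```python
def is_at_risk(low_importance_drifted: int, high_importance_drifted: int,
               low_limit: int = 1, high_limit: int = 3) -> bool:
    """Return True if either count of drifted features exceeds its limit.

    Both comparisons are strict ("greater than"), matching the rule text.
    """
    return (low_importance_drifted > low_limit
            or high_importance_drifted > high_limit)


print(is_at_risk(2, 0))  # two low-importance features drifted
print(is_at_risk(1, 3))  # neither count exceeds its limit
```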

## Schedule notification checks {: #schedule-notification-checks }

To schedule recurring checks to determine if data drift monitoring email notifications should be sent:

1. On the **Data Drift Settings** page, in the **Notifications** section, enable **Send notifications**.

2. Configure the settings for data drift notifications. The following table lists the scheduling options. All times are displayed in UTC:

    | Frequency     | Description |
    |---------------|-------------|
    | Every day     | Each day at the selected time. |
    | Every week    | Each selected day at the selected time. |
    | Every month   | Each month, on each selected day, at the selected time. The selected days in a month are provided as numbers (`1` to `31`) in a comma-separated list. |
    | Every quarter | Each month of a quarter, on each selected day, at the selected time. The selected days in each month are provided as numbers (`1` to `31`) in a comma-separated list. |
    | Every year    | Each selected month, on each selected day, at the selected time. The selected days in each month are provided as numbers (`1` to `31`) in a comma-separated list. |
    | **Use advanced scheduler** | :~~: |
    | Minute       | Each minute defined in a comma-separated list of numbers between `0` and `59`, or `*` for all. |
    | Hour         | Each hour defined in a comma-separated list of numbers between `0` and `23`, or `*` for all.   |
    | Day of month | Each day defined in a comma-separated list of numbers between `1` and `31`, or `*` for all.    |
    | Month        | Each month defined in a comma-separated list of numbers between `1` and `12`, or `*` for all.  |
    | Day of week  | Each weekday defined in a comma-separated list of numbers between `0` and `6`, or `*` for all. |

3. After updating the scheduling settings, click **Save**.

    {% include 'includes/notification-check-include.md' %}
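
To illustrate how the advanced scheduler fields combine, the sketch below checks whether a UTC timestamp matches a five-field schedule. It is not DataRobot's scheduler. The dictionary keys, the function names, the choice to require all five fields to match, and the mapping of `0` to Sunday for day of week are assumptions for this example.

```python
from datetime import datetime


def parse_field(spec, low, high):
    """Expand "*" or a comma-separated list into a set of allowed ints."""
    if spec.strip() == "*":
        return set(range(low, high + 1))
    values = {int(part) for part in spec.split(",")}
    bad = sorted(v for v in values if not low <= v <= high)
    if bad:
        raise ValueError(f"values out of range {low}-{high}: {bad}")
    return values


def matches(schedule, when):
    """Return True if `when` satisfies every field of `schedule`."""
    # Python's weekday() uses Monday=0; convert to Sunday=0 (assumption).
    day_of_week = (when.weekday() + 1) % 7
    return (
        when.minute in parse_field(schedule["minute"], 0, 59)
        and when.hour in parse_field(schedule["hour"], 0, 23)
        and when.day in parse_field(schedule["day_of_month"], 1, 31)
        and when.month in parse_field(schedule["month"], 1, 12)
        and day_of_week in parse_field(schedule["day_of_week"], 0, 6)
    )


# Hypothetical schedule: 09:00 and 17:00 on Mondays, in any month.
schedule = {"minute": "0", "hour": "9,17", "day_of_month": "*",
            "month": "*", "day_of_week": "1"}
print(matches(schedule, datetime(2024, 1, 1, 9, 0)))   # a Monday at 09:00
print(matches(schedule, datetime(2024, 1, 2, 9, 0)))   # a Tuesday at 09:00
```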